{ "cells": [ { "cell_type": "markdown", "id": "9cd8c73a-a6b2-4a2c-acc3-ea96f365a2b5", "metadata": {}, "source": [ "### Exploring Pipeline APIs on Blueshift Notebooks\n", "\n", "The pipeline APIs on Blueshift provides a systematic way to analyse a large universe of instruments. In this notebook, we will set up a simple notebook to understand how to create and use pipelines.\n", "\n", "The first thing to do is to import the necessary classes from the `blueshift.pipeline` module, namely the `Pipeline` constructor, the `CustomFilter` and the `CustomFactor`. The last two are required as we want to define our own filters and factors (instead of using the built-in ones). We also need to import `EquityPricing` - this is a column definition that binds the pipeline to its required inputs. Let's define a custom filter and a custom factor as show below." ] }, { "cell_type": "code", "execution_count": 1, "id": "211d7ac3-084e-4528-b564-0c74727bb949", "metadata": {}, "outputs": [], "source": [ "from blueshift.research import use_dataset, run_pipeline\n", "\n", "from blueshift.pipeline import Pipeline, CustomFilter, CustomFactor\n", "from blueshift.pipeline.data import EquityPricing\n", "\n", "class TypicalPriceUp(CustomFilter):\n", " inputs = [EquityPricing.high, EquityPricing.low, EquityPricing.close]\n", " \n", " def compute(self,today,assets,out, high_price, low_price, close_price):\n", " typical = (high_price + low_price + close_price)/3\n", " out[:] = typical[-1] > typical[-2]\n", "\n", "class PeriodReturns(CustomFactor):\n", " inputs = [EquityPricing.close]\n", " \n", " def compute(self,today,assets,out, close_price):\n", " returns = close_price[-1]/close_price[0] - 1\n", " out[:] = returns" ] }, { "cell_type": "markdown", "id": "c704f018-5625-40b3-b18f-3590d7616a6a", "metadata": {}, "source": [ "In the above custom factor (`PeriodReturns`), we define the inputs to the pipeline computation to be only the \"close\" price column. 
However, the filter (`TypicalPriceUp`) requires three pricing columns. These are defined in the respective `inputs` class variables. When running the computation, the pipeline automatically reads the `inputs` class variable and passes the required columns to the `compute` method. As a result, the `compute` method of the filter expects three extra pricing columns (apart from the always-present `today`, `assets` and `out` parameters), while that of the factor expects only one. With this, we simply write the required logic in the `compute` method and make sure the results are written back to the provided `out` parameter. Note: a filter must output `True` or `False` values from its computation, and a factor's output should be real values.\n", "\n", "Now let's create a pipeline and run it." ] }, { "cell_type": "code", "execution_count": 2, "id": "0610d103-0673-45f4-94b1-8d5934679516", "metadata": {}, "outputs": [], "source": [ "pipe = Pipeline()\n", "pipe.add(PeriodReturns(window_length=10), 'returns')\n", "pipe.set_screen(TypicalPriceUp(window_length=5))" ] }, { "cell_type": "markdown", "id": "c90445c8-9112-4555-a636-bb49496d3f11", "metadata": {}, "source": [ "We have added the `PeriodReturns` factor using the `add` method, and named it simply \"returns\". We also used the `set_screen` method to add the `TypicalPriceUp` filter. Now let's run the pipeline over some period." ] }, { "cell_type": "code", "execution_count": 3, "id": "265a4c90-026e-43d8-bd2d-7fb29c321452", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table>\n", "<thead><tr><th></th><th></th><th>returns</th></tr></thead>\n", "<tbody>\n",
"<tr><th>2020-10-12 00:00:00+05:30</th><th>Equity(3MINDIA [3])</th><td>0.072567</td></tr>\n",
"<tr><th></th><th>Equity(AARTIDRUGS [7])</th><td>0.324145</td></tr>\n",
"<tr><th></th><th>Equity(AAVAS [10])</th><td>0.061269</td></tr>\n",
"<tr><th></th><th>Equity(ABB [12])</th><td>0.028202</td></tr>\n",
"<tr><th></th><th>Equity(ABBOTINDIA [13])</th><td>-0.006489</td></tr>\n",
"<tr><th>...</th><th>...</th><td>...</td></tr>\n",
"<tr><th>2020-10-15 00:00:00+05:30</th><th>Equity(WINPRO [710])</th><td>0.001053</td></tr>\n",
"<tr><th></th><th>Equity(XCHANGING [1492])</th><td>-0.050919</td></tr>\n",
"<tr><th></th><th>Equity(ZODIACLOTH [1503])</th><td>-0.001485</td></tr>\n",
"<tr><th></th><th>Equity(ZOTA [1505])</th><td>-0.008174</td></tr>\n",
"<tr><th></th><th>Equity(ZYDUSWELL [1508])</th><td>-0.008715</td></tr>\n",
"</tbody>\n", "</table>\n", "<p>1437 rows × 1 columns</p>\n", "</div>
" ], "text/plain": [ " returns\n", "2020-10-12 00:00:00+05:30 Equity(3MINDIA [3]) 0.072567\n", " Equity(AARTIDRUGS [7]) 0.324145\n", " Equity(AAVAS [10]) 0.061269\n", " Equity(ABB [12]) 0.028202\n", " Equity(ABBOTINDIA [13]) -0.006489\n", "... ...\n", "2020-10-15 00:00:00+05:30 Equity(WINPRO [710]) 0.001053\n", " Equity(XCHANGING [1492]) -0.050919\n", " Equity(ZODIACLOTH [1503]) -0.001485\n", " Equity(ZOTA [1505]) -0.008174\n", " Equity(ZYDUSWELL [1508]) -0.008715\n", "\n", "[1437 rows x 1 columns]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "use_dataset('nse')\n", "\n", "results = run_pipeline(pipe, '2020-10-10', '2020-10-15')\n", "results" ] }, { "cell_type": "markdown", "id": "4d0c85c2-6320-4bf1-a94e-d083f8114703", "metadata": {}, "source": [ "The input to the pipeline `compute` function is automatically computed (based on the all surviving `assets` on the day of computation and the required pricing fields as implied by the `inputs` class variable). The output is a multi-index dataframe with compute date as the first level index and the assets (passing the filter on that day) as the second level. The columns (if any) will be the factors that were `add`-ed to the pipe. We can easily subset for further analysis, for example the output on 12th Oct is as below" ] }, { "cell_type": "code", "execution_count": 4, "id": "16735a25-aca5-42fb-b972-37ad6e94b5b3", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table>\n", "<thead><tr><th></th><th>returns</th></tr></thead>\n", "<tbody>\n",
"<tr><th>Equity(3MINDIA [3])</th><td>0.072567</td></tr>\n",
"<tr><th>Equity(AARTIDRUGS [7])</th><td>0.324145</td></tr>\n",
"<tr><th>Equity(AAVAS [10])</th><td>0.061269</td></tr>\n",
"<tr><th>Equity(ABB [12])</th><td>0.028202</td></tr>\n",
"<tr><th>Equity(ABBOTINDIA [13])</th><td>-0.006489</td></tr>\n",
"<tr><th>...</th><td>...</td></tr>\n",
"<tr><th>Equity(WONDERLA [1490])</th><td>0.070835</td></tr>\n",
"<tr><th>Equity(ZENTEC [1501])</th><td>0.091623</td></tr>\n",
"<tr><th>Equity(ZOTA [1505])</th><td>0.002055</td></tr>\n",
"<tr><th>Equity(ZYDUSLIFE [221])</th><td>0.130435</td></tr>\n",
"<tr><th>Equity(ZYDUSWELL [1508])</th><td>-0.005943</td></tr>\n",
"</tbody>\n", "</table>\n", "<p>377 rows × 1 columns</p>\n", "</div>
" ], "text/plain": [ " returns\n", "Equity(3MINDIA [3]) 0.072567\n", "Equity(AARTIDRUGS [7]) 0.324145\n", "Equity(AAVAS [10]) 0.061269\n", "Equity(ABB [12]) 0.028202\n", "Equity(ABBOTINDIA [13]) -0.006489\n", "... ...\n", "Equity(WONDERLA [1490]) 0.070835\n", "Equity(ZENTEC [1501]) 0.091623\n", "Equity(ZOTA [1505]) 0.002055\n", "Equity(ZYDUSLIFE [221]) 0.130435\n", "Equity(ZYDUSWELL [1508]) -0.005943\n", "\n", "[377 rows x 1 columns]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "results.xs(pd.Timestamp('2020-10-12', tz='Asia/Calcutta'))" ] }, { "cell_type": "markdown", "id": "82ac7a0c-03d9-48ba-aeeb-0b92891bf26e", "metadata": {}, "source": [ "When you are using pipline APIs in a strategy, this is exactly what you get as the returned value from the `pipeline_output` API function - a dataframe like above computed for the current date for the strategy.\n", "\n", "Pipline APIs provide a powerful way to anaylze large universe, including factor and ML strategies. Now that you know how to create and use pipeline, feel free to explore more!" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 5 }